Abstract: We examine codes, over the additive Gaussian noise 
channel, designed for reliable communication at some specific 
signal-to-noise ratio (SNR) and constrained by the permitted 
minimum mean-square error (MMSE) at lower SNRs. The 
maximum possible rate is below point-to-point capacity, and 
hence these are non-optimal codes (alternatively referred to as 
"bad" codes). We show that the maximum possible rate is the 
one attained by superposition codebooks. Moreover, the MMSE 
and mutual information behavior as a function of SNR, for any 
code attaining the maximum rate under the MMSE constraint, is 
known for all SNR. We also provide a lower bound on the MMSE 
for finite length codes, as a function of the error probability of 
the code. 

Index Terms — Gaussian channel, MMSE constrained codes, 
non-optimal codes, bad codes, superposition codebooks, I-MMSE, 
interference, disturbance. 

I. Introduction 

CAPACITY and capacity achieving codes have been the 
main concern of information theory from the very begin- 
ning. Trying to design capacity achieving codes is a central 
goal of many researchers in this field. Specifically, in point- 
to-point channels, for which a single-letter expression of the 
capacity is well known [1], the emphasis is given to the 
properties and design of capacity achieving codes. One such 
important property, derived in [2], is that the behavior 
of the mutual information between the transmitted codeword 
and the channel output, and thus also the behavior of the 
minimum mean-square error (MMSE) when estimating the 
transmitted codeword from the channel output, both as a 
function of the output's signal-to-noise ratio (SNR), is known 
exactly for "good" (capacity achieving) point-to-point codes, 
regardless of the specific structure of the code. 

Recently some emphasis has been given to the research 
of non-capacity achieving point-to-point codes [3], [4]. These 
codes, referred to as "bad" point-to-point codes [4], are heavily 
used in many multi-terminal wireless networks, and perform 
better, in terms of achievable rates, compared to point-to-point 
capacity achieving codes. Bennatan et al. [3] have argued 
that such codes have inherent benefits that often make them 
better candidates for multi-terminal wireless communication. 
For example, in [2] it was concluded, through the investigation 
of the extrinsic information (EXIT) behavior, that "good" 
codes cannot function well as turbo component codes within 
an iterative belief-propagation decoding procedure. 

The first question that comes to mind is: What are these 
inherent benefits that make these codes better candidates for 
multi-terminal wireless communication? It is known that "bad" 
codes can obtain lower MMSE at low SNRs as compared 
to "good" point-to-point codes [4]. The hypothesis is that 
this property is the inherent benefit of "bad" codes. Surely, 
lower MMSE at lower SNRs is meaningless in point-to-point 
communication, where all that matters is the performance at 
the intended receiver. However, in multi-terminal wireless net- 
works, such as cellular networks, the case is different. In such 
networks there are two fundamental phenomena: interference 
from one node to another (an interference channel), and the 
potential cooperation between nodes (a relay channel). In the 
interference channel, where a message sent to an intended 
receiver acts as interference to other receivers in the network, a 
lower MMSE implies better possible interference cancelation, 
and thus improved rates for the interfered user. In the relay 
channel, the goal of the relay is to decode the intended 
message, so as to assist the transmission. In this case, a 
lower MMSE helps when full decoding is not possible. The 
relay may then use soft decoding, as suggested in [3]. These 
two advantages have been the center of the investigation in 
[3], where two specific soft decoding algorithms, one for an 
interference scenario and the other for a relay scenario, have 
been analyzed. It was shown that for "bad" LDPC codes, better 
achievable rates can be obtained, as compared to "good" point- 
to-point codes. 

The problem that motivated this work is the Gaussian 
interference channel, where the question of how to handle 
interference is still open. Surely, when the interference can 
be decoded, as in the case of strong interference, then joint 
decoding is the optimal scheme and attains capacity [5]-[8]. 
However, what should one do with interference that 
cannot be decoded? Should we treat it as noise? Should we 
partially decode it? This question has been the subject of 
several works. As explained above, Bennatan et al. [3] claim 
that soft decoding is a useful compromise in cases where 
complete decoding would be desirable if possible, but is not 
required by the terms of the problem, and show that specific 
"bad" LDPC codes attain better rates compared to "good" 
point-to-point codes. In [9] the authors establish the capacity 
region of the K-user Gaussian interference channel, when all 
users are constrained to use point-to-point codes. The capacity 
region is shown to be achieved by a combination of treating 
interference as noise and joint decoding. A similar setting 
was also discussed in [10], and in [11] the question of whether 
treating interference as noise is optimal was asked for a more 
elaborate system of a point-to-point channel interfering with 
a MAC. In [12] the authors examine the interference channel 
from the point of view of a single transmitter-receiver pair 
that is being interfered with. They proposed a strategy to determine 
the rate, by partitioning the set of interfering users into two disjoint 
subsets, namely the set of decodable interferences and the set 
of non-decodable interferences. The authors show that, when 
assuming that all interferences are Gaussian, their strategy 
achieves capacity. Finally, in [13], the authors examined the 
alternatives to treating the interference as Gaussian noise, 
assuming the receiver knows the constellation set used by 
the interferer. This makes the interference plus noise a mixed 
Gaussian process. Under these assumptions the authors de- 
velop an achievable rate, with improved sum-rate as compared 
to the one obtained when using Gaussian codebooks and 
treating interference as Gaussian noise. 

In this work we examine a simplified scenario, as compared 
to the interference channel, in which we have only a single 
transmitter with a single intended receiver. The transmitted 
message reaches one or more unintended receivers, which 
are not interested in the transmitted message. The question 
asked is: if these unintended receivers wish to estimate the 
transmitted message with limited error, that is, some constraint 
on the MMSE, what is the maximum rate of transmission? 
The connection to the interference model is clear. Assuming 
that a good approach is to remove the estimated codeword, 
one can think of the MMSE as the remaining interference. 
Note that the model examined here is a simplified version 
as compared to the interference channel, as we have omitted 
the messages intended to each of the unintended receivers. 
However, we trust that this simplified model is an important 
building block towards the understanding of the interference 
channel, and specifically the analysis of coding schemes using 
partial interference cancelation. 

The importance of the problem examined here is also 
apparent from the results obtained. We show that the optimal 
MMSE-wise codebook (that is, the codebook attaining the 
maximum rate given the MMSE constraint) in the examined 
setting, is the Gaussian superposition codebook. It is well 
known that the best achievable region for the two-user interfer- 
ence channel is given by the Han and Kobayashi (HK) scheme 
[7]. This scheme uses partial decoding of the interfering 
message at the receiver. Rate splitting (that is, superposition 
coding) is a special case of the HK scheme, and is also point- 
to-point "bad" (see [3, Appendix VIII-C]). It was shown in 
[14] that these codes are close to optimal, and in fact are 
within one bit from capacity. Our results give an engineering 
insight into the good performance of the HK scheme. 

In parallel to our work, Bandemer and El Gamal [15] 
examined the same model but for the general discrete memoryless 
channel (DMC). Bandemer and El Gamal [15] chose to 
quantify the interference (the "disturbance") using the mutual 
information at each of the unintended receivers, rather than 
the MMSE. They provide the rate-disturbance region: given a 
constraint on the disturbance, that is, on the amount of information 
transmitted to the unintended receiver, what is the maximum rate 
that can be transmitted reliably? We elaborate and compare the 
two methods, specifically for the Gaussian channel, in section 
VII. 

More specifically, we are examining the transmission of 
length $n$ codewords over a discrete memoryless standard 
Gaussian channel,

$$ Y = \sqrt{\mathsf{snr}}\, X + N \tag{1} $$

where $N$ is standard additive Gaussian noise. The codewords 
are constrained by the standard average power constraint:

$$ \forall \mathbf{x} \in C_n: \quad \frac{1}{n} \sum_{i=1}^{n} x_i^2 \le 1 \tag{2} $$

where $C_n$ stands for a code of length $n$ codewords. 

We distinguish between channel outputs at different SNRs 
using the following notation:

$$ Y(\gamma) = \sqrt{\gamma}\, X + N \tag{3} $$

and for a length $n$ codeword we use the boldface notation:

$$ \mathbf{Y}(\gamma) = \sqrt{\gamma}\, \mathbf{X} + \mathbf{N}. \tag{4} $$

Thus, the normalized mutual information between the input 
and the output will be denoted as:

$$ I_n(\gamma) = \frac{1}{n} I(\mathbf{X}; \mathbf{Y}(\gamma)). \tag{5} $$

The remainder of this paper is organized as follows: in 
section II we give some preliminary definitions and results. 
The problem is formulated precisely in section III. The results 
are then given in the three separate sections: for a single 
MMSE constraint in section IV, for K MMSE constraints in 
section V and a lower bound on the MMSE for finite length 
codes is given in section VI. As stated above, a comparison 
with the work of Bandemer and El Gamal [15] is given in 
section VII, adhering to an I-MMSE perspective. We conclude 
our work and discuss future challenges in section VIII. 

II. Preliminary Definitions and Results 

Before formulating the problem precisely, in Section III, we 
wish to define and present several key ingredients. 

A. Non-Optimal Code Sequences 

We begin by presenting a family of non-optimal code 
sequences for which our solution is valid. 

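Definition 1: A non-optimal code-sequence $c = \{C_n\}_{n=1}^{\infty}$, 
for a channel with capacity $C$, is a code-sequence with 
vanishing error probability 

$$ P_e^n \xrightarrow[n \to \infty]{} 0 $$

where $P_e^n$ is the error probability of the code $C_n$, and rate 
satisfying 

$$ \lim_{n \to \infty} \frac{1}{n} \log M_n < C \tag{6} $$

where $M_n$ is the size of code $C_n$. Moreover, we require 

$$ \mathrm{MMSE}^{c_n}(\gamma) \xrightarrow[n \to \infty]{} \mathrm{MMSE}^{c}(\gamma) \tag{7} $$

where $\mathrm{MMSE}^{c_n}(\gamma) = \frac{1}{n} \mathrm{Tr}(\mathbf{E}_{\mathbf{X}}(\gamma))$ and $\mathbf{E}_{\mathbf{X}}(\gamma)$ is the MMSE 
matrix defined as follows: 

$$ \mathbf{E}_{\mathbf{X}}(\gamma) = E\left\{ \left(\mathbf{X} - E\{\mathbf{X} \mid \sqrt{\gamma}\mathbf{X} + \mathbf{N}\}\right) \left(\mathbf{X} - E\{\mathbf{X} \mid \sqrt{\gamma}\mathbf{X} + \mathbf{N}\}\right)^{T} \right\} \tag{8} $$

with the random variable $\mathbf{X}$ uniformly distributed over the 
$M_n$ codewords of $C_n$. 

Note that the requirement in (7) is not very restrictive, as 
$\mathrm{MMSE}^{c_n}(\gamma)$ can be both upper and lower bounded by a 
function of $P_e^n(\gamma)$. The convergence of $P_e^n(\gamma)$ has been 
discussed in [16]. 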
B. The I-MMSE Approach 

The approach used in order to provide insight into the 
MMSE constrained problem is the I-MMSE approach, that 
is to say, we make use of the fundamental relationship 
between the mutual information and the MMSE in the Gaussian 
channel and its generalizations [17], [18]. Even though we 
are examining a scalar Gaussian channel, the $n$-dimensional 
version of this relationship is required since we are looking at 
the transmission of length $n$ codewords through the channel. 
In our setting the relationship is as follows: 

$$ I_n(\mathsf{snr}) = \frac{1}{2} \int_{0}^{\mathsf{snr}} \mathrm{MMSE}^{c_n}(\gamma)\, d\gamma. \tag{9} $$

Restricting our observations to the family of non-optimal code 
sequences defined in Definition 1, we can take the limit as 
$n \to \infty$ on both sides: 

$$ I(\mathsf{snr}) = \lim_{n \to \infty} I_n(\mathsf{snr}) = \frac{1}{2} \int_{0}^{\mathsf{snr}} \mathrm{MMSE}^{c}(\gamma)\, d\gamma \tag{10} $$

where the exchange of limit and integration on the right-hand 
side is according to Lebesgue's dominated convergence 
theorem [19], the fact that $\mathrm{MMSE}^{c}(\gamma)$ is upper bounded, and 
the condition in equation (7). 
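As a quick sanity check of the I-MMSE relationship (10), the following minimal sketch integrates the MMSE of a unit-variance Gaussian input numerically and compares the result with the closed form $\frac{1}{2}\log(1+\mathsf{snr})$. The Gaussian input, the natural logarithm (nats), and the helper names `mmse_gaussian` and `half_integral` are assumptions made for illustration only; they are not part of the paper.

```python
import math

def mmse_gaussian(gamma, variance=1.0):
    """MMSE of a zero-mean Gaussian input with the given variance at SNR gamma."""
    return variance / (1.0 + variance * gamma)

def half_integral(f, lo, hi, steps=20000):
    """0.5 * integral of f over [lo, hi], composite trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return 0.5 * h * total

snr = 2.5
i_from_mmse = half_integral(mmse_gaussian, 0.0, snr)   # right-hand side of (10)
i_closed_form = 0.5 * math.log(1.0 + snr)              # Gaussian-input mutual information, nats
print(i_from_mmse, i_closed_form)
```

The two printed values should agree up to the numerical integration error.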

The main property of the I-MMSE used in the sequel is 
an n-dimensional "single crossing point" property derived in 
[20] given here for completeness. This property is an extension 
of the scalar "single crossing point" property shown in [21]. 
The following function is a simplified version (sufficient for 
our use in this paper) of the function defined in [20]. For an 
arbitrary random vector X: 

$$ q(\mathbf{X}, \sigma^2, \gamma) = \frac{\sigma^2}{1 + \sigma^2 \gamma} - \frac{1}{n} \mathrm{Tr}(\mathbf{E}_{\mathbf{X}}(\gamma)). \tag{11} $$

The following theorem is proved in [20]. 

Theorem 1 ([20]): The function $\gamma \mapsto q(\mathbf{X}, \sigma^2, \gamma)$, defined 
in (11), has no nonnegative-to-negative zero crossings and, 
at most, a single negative-to-nonnegative zero crossing in 
the range $\gamma \in [0, \infty)$. Moreover, let $\mathsf{snr}_0 \in [0, \infty)$ be that 
negative-to-nonnegative crossing point. Then, 

1) $q(\mathbf{X}, \sigma^2, 0) \le 0$. 

2) $q(\mathbf{X}, \sigma^2, \gamma)$ is a strictly increasing function in the range 
$\gamma \in [0, \mathsf{snr}_0)$. 

3) $q(\mathbf{X}, \sigma^2, \gamma) \ge 0$ for all $\gamma \in [\mathsf{snr}_0, \infty)$. 

4) $\lim_{\gamma \to \infty} q(\mathbf{X}, \sigma^2, \gamma) = 0$. 

The above property is valid for all natural $n$, thus we may also 
take $n \to \infty$. 
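The single crossing point property can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the construction of [20]: it replaces the normalized trace in (11) by the limiting MMSE of a two-layered superposition code (the piecewise form stated later in Theorem 2, with the parameters quoted for Fig. 1) and counts sign changes of $q$ on a grid. The function names and the grid resolution are hypothetical choices.

```python
def mmse_two_layer(gamma, snr0, snr1, beta):
    """Limiting MMSE of a two-layered superposition code (Theorem 2 with K = 1)."""
    if gamma < snr0:
        return 1.0 / (1.0 + gamma)
    if gamma < snr1:
        return beta / (1.0 + beta * gamma)
    return 0.0

def q(gamma, sigma2, snr0, snr1, beta):
    """q from (11), with the normalized trace replaced by the limiting MMSE."""
    return sigma2 / (1.0 + sigma2 * gamma) - mmse_two_layer(gamma, snr0, snr1, beta)

def count_crossings(sigma2, snr0=2.0, snr1=2.5, beta=0.4, top=10.0, steps=4000):
    """Count sign changes of q on a uniform grid over [0, top]."""
    up = down = 0
    prev = q(0.0, sigma2, snr0, snr1, beta)
    for k in range(1, steps + 1):
        cur = q(top * k / steps, sigma2, snr0, snr1, beta)
        if prev < 0.0 <= cur:
            up += 1        # negative-to-nonnegative crossing
        elif cur < 0.0 <= prev:
            down += 1      # nonnegative-to-negative crossing
        prev = cur
    return up, down

for sigma2 in (0.3, 0.5, 1.0):
    print(sigma2, count_crossings(sigma2))
```

For each tested variance the count of nonnegative-to-negative crossings should be zero and the count of negative-to-nonnegative crossings at most one, in line with Theorem 1.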

C. Superposition Coding 

An important family of non-optimal codes, that is, a family 
of codes that do not attain the point-to-point capacity, is 
that of Gaussian superposition codes which are optimal for 
a degraded Gaussian BC [1]. We refer to this family of 
codes as optimal Gaussian superposition codes. As will be 
shown in the sequel, optimal Gaussian superposition codes 
are optimal MMSE-wise. We begin by formally defining a 
two-layered optimal Gaussian superposition code. The extension 
of the definition to a general L-layered optimal Gaussian 
superposition code (L > 1) is straightforward. 



Definition 2 ([1]): Given a pair of SNRs, $(\mathsf{snr}_0, \mathsf{snr}_1)$, 
where $\mathsf{snr}_0 < \mathsf{snr}_1$, two-layered optimal Gaussian 
superposition codes are all codebooks that can be constructed as 
follows: 

• Choose a $\beta \in [0, 1]$. 

• Set $R_u = \frac{1}{2} \log\left(\frac{1 + \mathsf{snr}_0}{1 + \beta\mathsf{snr}_0}\right)$. Fill the first codebook $C_u^n = 
\{\mathbf{u}_1, \ldots, \mathbf{u}_{M_u}\}$ with $M_u$ i.i.d. Gaussian vectors of 
average power $1 - \beta$, where $M_u = 2^{nR_u}$. This is the common 
message. 

• Set $R_v = \frac{1}{2} \log(1 + \beta\mathsf{snr}_1)$. Fill the second codebook 
$C_v^n = \{\mathbf{v}_1, \ldots, \mathbf{v}_{M_v}\}$ with $M_v$ i.i.d. Gaussian vectors of 
average power $\beta$, where $M_v = 2^{nR_v}$. This is the private 
message. 

• Construct the third codebook by taking the sum $C_n = 
C_u^n + C_v^n$, for which the cardinality is, almost surely, equal 
to $|C_u^n||C_v^n|$. Thus, the rate is, almost surely, equal to 
$R_u + R_v$. 
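The construction in Definition 2 can be sketched in a few lines for a toy block length. This is only an illustration under assumptions not made in the paper: base-2 logarithms for the rates, codebook sizes rounded down to integers, a fixed random seed, and hypothetical function names. At such a small $n$ the power constraint (2) holds only on average rather than per codeword.

```python
import math
import random

def gaussian_codebook(size, n, power, rng):
    """`size` codewords of length n with i.i.d. N(0, power) entries."""
    sigma = math.sqrt(power)
    return [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(size)]

def superposition_codebook(snr0, snr1, beta, n, seed=0):
    rng = random.Random(seed)
    r_u = 0.5 * math.log2((1 + snr0) / (1 + beta * snr0))  # common-message rate
    r_v = 0.5 * math.log2(1 + beta * snr1)                  # private-message rate
    m_u = max(1, int(2 ** (n * r_u)))   # sizes rounded down to integers
    m_v = max(1, int(2 ** (n * r_v)))
    c_u = gaussian_codebook(m_u, n, 1 - beta, rng)
    c_v = gaussian_codebook(m_v, n, beta, rng)
    # Sum codebook: every pair (u, v) contributes one codeword u + v.
    code = [[a + b for a, b in zip(u, v)] for u in c_u for v in c_v]
    return code, m_u, m_v

code, m_u, m_v = superposition_codebook(snr0=2.0, snr1=2.5, beta=0.4, n=8)
avg_power = sum(x * x for c in code for x in c) / (len(code) * 8)
print(len(code), m_u, m_v, avg_power)
```

The codebook size equals the product of the two layer sizes, and the empirical average power is close to one because the two layers carry powers $1-\beta$ and $\beta$.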

The analysis of this family (two layers) was done by Merhav 
et al. in [22, section V.C] from a statistical physics perspective. 
As noted in [22], the MMSE of this family of codebooks 
undergoes phase transitions, that is, it is a discontinuous 
function of $\gamma$. The mutual information, $I(\gamma)$, and $\mathrm{MMSE}^{c}(\gamma)$ 
of this family of codebooks are known exactly and given in 
the next theorem (for $L = K + 1$ layers). 

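Theorem 2 (extension of [22, section V.C]): A $(K + 1)$-layered 
optimal Gaussian superposition codebook designed 
for $(\mathsf{snr}_0, \mathsf{snr}_1, \ldots, \mathsf{snr}_K)$ with rate-splitting coefficients 
$\beta_0 > \cdots > \beta_{K-1}$ has the following $I(\gamma)$: 

$$
I(\gamma) =
\begin{cases}
\frac{1}{2}\log(1 + \gamma), & \text{if } 0 \le \gamma < \mathsf{snr}_0 \\
\frac{1}{2}\log\left(\frac{1 + \mathsf{snr}_0}{1 + \beta_0\mathsf{snr}_0} \prod_{j=1}^{i} \frac{1 + \beta_{j-1}\mathsf{snr}_j}{1 + \beta_j\mathsf{snr}_j}\right) + \frac{1}{2}\log(1 + \beta_i\gamma), & \text{if } \mathsf{snr}_i \le \gamma < \mathsf{snr}_{i+1} \\
\frac{1}{2}\log\left(\frac{1 + \mathsf{snr}_0}{1 + \beta_0\mathsf{snr}_0} \prod_{j=1}^{K-1} \frac{1 + \beta_{j-1}\mathsf{snr}_j}{1 + \beta_j\mathsf{snr}_j}\right) + \frac{1}{2}\log(1 + \beta_{K-1}\mathsf{snr}_K), & \text{if } \mathsf{snr}_K \le \gamma
\end{cases}
\tag{12}
$$

and the following $\mathrm{MMSE}^{c}(\gamma)$: 

$$
\mathrm{MMSE}^{c}(\gamma) =
\begin{cases}
\frac{1}{1 + \gamma}, & 0 \le \gamma < \mathsf{snr}_0 \\
\frac{\beta_i}{1 + \beta_i\gamma}, & \mathsf{snr}_i \le \gamma < \mathsf{snr}_{i+1} \\
0, & \mathsf{snr}_K \le \gamma
\end{cases}
\tag{13}
$$

where $i \in \{0, 1, \ldots, K-1\}$. 

Proof: An alternative proof to the one given in [22, 
section V.C] is given in the Appendix. ■ 

An example of a two-layered optimal Gaussian superposition 
code is depicted in Figure 1, and a 4-layered optimal Gaussian 
superposition code is depicted in Figure 2. 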
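The expressions of Theorem 2 are easy to evaluate and to cross-check against the I-MMSE relationship (10). The sketch below is a minimal illustration, assuming natural logarithms; the mutual information is accumulated layer by layer, which is algebraically equivalent to (12). Function names and the midpoint integration helper are hypothetical.

```python
import math

def superposition_mmse(gamma, snrs, betas):
    """MMSE from (13); snrs = (snr_0..snr_K), betas = (beta_0..beta_{K-1})."""
    if gamma < snrs[0]:
        return 1.0 / (1.0 + gamma)
    if gamma >= snrs[-1]:
        return 0.0
    # Find the layer i with snr_i <= gamma < snr_{i+1}.
    i = max(k for k in range(len(betas)) if snrs[k] <= gamma)
    return betas[i] / (1.0 + betas[i] * gamma)

def superposition_mi(gamma, snrs, betas):
    """I(gamma) in nats, accumulated per layer (equivalent to (12))."""
    if gamma < snrs[0]:
        return 0.5 * math.log(1.0 + gamma)
    total = 0.5 * math.log(1.0 + snrs[0])
    for i, beta in enumerate(betas):
        upper = min(gamma, snrs[i + 1])
        if upper <= snrs[i]:
            break
        total += 0.5 * math.log((1 + beta * upper) / (1 + beta * snrs[i]))
    return total

def half_integral(f, lo, hi, steps=30000):
    """0.5 * integral of f over [lo, hi] by the midpoint rule."""
    h = (hi - lo) / steps
    return 0.5 * h * sum(f(lo + (k + 0.5) * h) for k in range(steps))

snrs, betas = (2.0, 2.5), (0.4,)          # parameters quoted for Fig. 1
for gamma in (1.0, 2.2, 3.0):
    numeric = half_integral(lambda g: superposition_mmse(g, snrs, betas), 0.0, gamma)
    print(gamma, superposition_mi(gamma, snrs, betas), numeric)
```

For every $\gamma$ the closed-form mutual information and the numerically integrated MMSE should agree up to the integration error, including across the discontinuities of the MMSE.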

III. Problem Formulation 

As stated, we are examining the scalar additive Gaussian 
channel, through which we transmit length n codewords. For 
this setting we investigate the trade-off between rate and 
MMSE. This trade-off can be formalized in two equivalent 
manners. The first: 
Assuming a pair of SNR points $(\mathsf{snr}_0, \mathsf{snr}_1)$ where $\mathsf{snr}_0 < 
\mathsf{snr}_1$, what is the solution of the following optimization 
problem: 

$$
\begin{aligned}
\max \quad & I(\mathsf{snr}_1) \\
\text{s.t.} \quad & \mathrm{MMSE}^{c}(\mathsf{snr}_0) \le \frac{\beta}{1 + \beta\mathsf{snr}_0}
\end{aligned}
\tag{14}
$$

for some $\beta \in [0, 1]$. 

Alternatively, an equivalent form of the above problem is: 

$$
\begin{aligned}
\min \quad & \mathrm{MMSE}^{c}(\mathsf{snr}_0) \\
\text{s.t.} \quad & I(\mathsf{snr}_1) \ge \frac{1}{2}\log(1 + \alpha\mathsf{snr}_1)
\end{aligned}
\tag{15}
$$

for some $\alpha \in [0, 1]$. The exact connection between the two 
optimization problems, and the parameters $\beta$ and $\alpha$, will be 
made clear in Section IV. The problem can also be extended 
to the general $K$ MMSE constraints as follows: 

Assume a $K + 1$ set of SNR points $(\mathsf{snr}_0, \mathsf{snr}_1, \ldots, \mathsf{snr}_K)$ 
such that $\mathsf{snr}_0 < \mathsf{snr}_1 < \cdots < \mathsf{snr}_K$ ($K \ge 1$ is some natural 
number). What is the solution of the following optimization 
problem: 

$$
\begin{aligned}
\max \quad & I(\mathsf{snr}_K) \\
\text{s.t.} \quad & \mathrm{MMSE}^{c}(\mathsf{snr}_i) \le \frac{\beta_i}{1 + \beta_i\mathsf{snr}_i}, \quad \forall i \in \{0, 1, \ldots, K-1\}
\end{aligned}
$$

for some $\beta_i \in [0, 1]$, $i \in \{0, 1, \ldots, K-1\}$, such that 
$\sum_{i=0}^{K-1} \beta_i \le 1$ and $\beta_{K-1} \le \beta_{K-2} \le \cdots \le \beta_1 \le \beta_0$. 

Fig. 1. The mutual information and $\mathrm{MMSE}^{c}(\gamma)$ of a two-layered 
superposition code with $(\mathsf{snr}_0, \mathsf{snr}_1) = (2, 2.5)$ and $\beta = 0.4$, and the mutual 
information and $\mathrm{MMSE}^{c}(\gamma)$ of an optimal code designed for $\mathsf{snr}_1$. 

Fig. 2. The mutual information and $\mathrm{MMSE}^{c}(\gamma)$ of a 4-layered superposition 
code with $(\mathsf{snr}_0, \mathsf{snr}_1, \mathsf{snr}_2, \mathsf{snr}_3) = (0.8, 1.7, 2.2, 3)$ and $(\beta_0, \beta_1, \beta_2) = 
(0.6, 0.4, 0.3)$, and the mutual information and $\mathrm{MMSE}^{c}(\gamma)$ of an optimal 
code designed for $\mathsf{snr}_3$. 



IV. Single MMSE Constraint 

In this section we present the main result of this paper, 
answering the following question: what is the maximum possible 
rate given a specific MMSE constraint at some lower SNR? In 
other words, we provide a solution to the optimization problem 
given in (14) (or alternatively, (15)). We first give the main 
results and then detail the proofs in the subsequent subsections. 

A. Main Results 

The main result is given in the next theorem. 

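Theorem 3: Assume a pair of SNRs, $(\mathsf{snr}_0, \mathsf{snr}_1)$, such that 
$\mathsf{snr}_0 < \mathsf{snr}_1$. The solution of the following optimization 
problem, 

$$
\begin{aligned}
\max \quad & I(\mathsf{snr}_1) \\
\text{s.t.} \quad & \mathrm{MMSE}^{c}(\mathsf{snr}_0) \le \frac{\beta}{1 + \beta\mathsf{snr}_0}
\end{aligned}
\tag{16}
$$

for some $\beta \in [0, 1]$, is the following 

$$ I(\mathsf{snr}_1) = \frac{1}{2}\log(1 + \beta\mathsf{snr}_1) + \frac{1}{2}\log\left(\frac{1 + \mathsf{snr}_0}{1 + \beta\mathsf{snr}_0}\right) \tag{17} $$

and is attainable when using the two-layered optimal Gaussian 
superposition codebook designed for $(\mathsf{snr}_0, \mathsf{snr}_1)$ with a 
rate-splitting coefficient $\beta$. 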

The proof of this theorem is given in subsection IV-B. 
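To get a feel for the trade-off stated in Theorem 3, the following sketch evaluates the rate in (17) for several values of $\beta$, together with the corresponding MMSE constraint at $\mathsf{snr}_0$. It assumes base-2 logarithms (bits per channel use) and reuses the SNR pair quoted for Fig. 1; the function names are hypothetical.

```python
import math

def max_rate(snr0, snr1, beta):
    """Right-hand side of (17), in bits per channel use (base-2 logs assumed)."""
    return (0.5 * math.log2(1 + beta * snr1)
            + 0.5 * math.log2((1 + snr0) / (1 + beta * snr0)))

def capacity(snr):
    """Point-to-point Gaussian capacity at the given SNR, in bits."""
    return 0.5 * math.log2(1 + snr)

snr0, snr1 = 2.0, 2.5
for beta in (0.0, 0.2, 0.4, 0.7, 1.0):
    constraint = beta / (1 + beta * snr0)       # allowed MMSE at snr0
    print(beta, round(constraint, 4), round(max_rate(snr0, snr1, beta), 4))
```

At $\beta = 1$ the constraint is inactive and the rate equals the capacity at $\mathsf{snr}_1$; at $\beta = 0$ the MMSE at $\mathsf{snr}_0$ must vanish and the rate drops to the capacity at $\mathsf{snr}_0$.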

An interesting question to ask is whether there could be a 
different code that attains the maximum rate under the MMSE 
constraint at $\mathsf{snr}_0$ (16) and also provides a lower MMSE at other 
values of SNR. The answer is negative, as stated in 
the next theorem. 

Theorem 4: From the set of reliable codes of rate $R_c = 
\frac{1}{2}\log(1 + \beta\mathsf{snr}_1) + \frac{1}{2}\log\left(\frac{1 + \mathsf{snr}_0}{1 + \beta\mathsf{snr}_0}\right)$, complying with the 
MMSE constraint at $\mathsf{snr}_0$, the two-layered optimal Gaussian 
superposition codebook designed for $(\mathsf{snr}_0, \mathsf{snr}_1)$ with a 
rate-splitting coefficient $\beta$ provides the minimum MMSE for all 
SNRs. 

B. Proof of Theorem 3 

Proof: It is simple to verify that the two-layered optimal 
Gaussian superposition codebook designed for $(\mathsf{snr}_0, \mathsf{snr}_1)$ 
with a rate-splitting coefficient $\beta$ complies with the above 
MMSE constraint and attains the maximum rate. Thus, the 
focus of the remainder of the proof is on deriving a tight upper 
bound on the rate. We first address the equivalent optimization 
problem, depicted in (15), and derive a lower bound on 
$\mathrm{MMSE}^{c}(\mathsf{snr}_0)$ given a code, designed for reliable transmission 
at $\mathsf{snr}_1$, of rate $R_c = \frac{1}{2}\log(1 + \alpha\mathsf{snr}_1)$. 

If $\alpha\mathsf{snr}_1 \le \mathsf{snr}_0 < \mathsf{snr}_1$ the lower bound is trivially zero using 
the optimal Gaussian codebook designed for $\alpha\mathsf{snr}_1$. Thus, we 
assume $\mathsf{snr}_0 < \alpha\mathsf{snr}_1$. 

Using the trivial upper bound $I(\mathsf{snr}_0) \le \frac{1}{2}\log(1 + \mathsf{snr}_0)$ 
(due to maximum entropy), we can lower bound the following 
difference, for any $\mathsf{snr}_0 < \alpha\mathsf{snr}_1$: 

$$ I(\mathsf{snr}_1) - I(\mathsf{snr}_0) \ge I(\mathsf{snr}_1) - \frac{1}{2}\log(1 + \mathsf{snr}_0). \tag{18} $$

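Using the I-MMSE relationship (10), the above translates to 
the following inequality: 

$$
\frac{1}{2}\int_{\mathsf{snr}_0}^{\mathsf{snr}_1} \mathrm{MMSE}^{c}(\gamma)\, d\gamma \ge R_c - \frac{1}{2}\log(1 + \mathsf{snr}_0)
= \frac{1}{2}\log(1 + \alpha\mathsf{snr}_1) - \frac{1}{2}\log(1 + \mathsf{snr}_0).
\tag{19}
$$

Defining $d$ through the following equality: 

$$
\frac{1}{2}\log(1 + \alpha\mathsf{snr}_1) - \frac{1}{2}\log(1 + \mathsf{snr}_0) =
\frac{1}{2}\log(1 + d\,\mathsf{snr}_1) - \frac{1}{2}\log(1 + d\,\mathsf{snr}_0),
\tag{20}
$$

it is simple to check that for $\mathsf{snr}_0 < \alpha\mathsf{snr}_1$, $d$ is in the range 
$(0, 1)$. Now we can continue with equation (19): 

$$
\frac{1}{2}\int_{\mathsf{snr}_0}^{\mathsf{snr}_1} \mathrm{MMSE}^{c}(\gamma)\, d\gamma \ge \frac{1}{2}\log(1 + d\,\mathsf{snr}_1) - \frac{1}{2}\log(1 + d\,\mathsf{snr}_0)
= \frac{1}{2}\int_{\mathsf{snr}_0}^{\mathsf{snr}_1} \mathrm{mmse}_G(\gamma)\, d\gamma
\tag{21}
$$

where $\mathrm{mmse}_G(\gamma)$ is the MMSE assuming a Gaussian random 
variable with variance $d$ transmitted through the additive 
Gaussian channel at SNR equal to $\gamma$. The single crossing point 
property (Theorem 1) tells us that $\mathrm{MMSE}^{c}(\gamma)$ and $\mathrm{mmse}_G(\gamma)$ 
cross each other at most once, and after that crossing point 
$\mathrm{mmse}_G(\gamma)$ remains an upper bound. From the inequality in 
(21) we can thus conclude that the single crossing point, if it 
exists, must occur in the region $(\mathsf{snr}_0, \infty)$. Thus, for $\mathsf{snr}_0$ we 
have the following lower bound: 

$$
\mathrm{MMSE}^{c}(\mathsf{snr}_0) \ge \frac{d(\mathsf{snr}_0)}{1 + d(\mathsf{snr}_0)\,\mathsf{snr}_0}
= \frac{\alpha\mathsf{snr}_1 - \mathsf{snr}_0}{\mathsf{snr}_1 - \mathsf{snr}_0} \cdot \frac{1}{1 + \mathsf{snr}_0}.
\tag{22}
$$

Note that $d(\cdot)$ is a function of $\mathsf{snr}_0$. 

In terms of the equivalent optimization problem, given in 
equation (14), the case of $\alpha\mathsf{snr}_1 \le \mathsf{snr}_0$ is equivalent to a zero 
constraint on the MMSE, that is, $\beta = 0$. For $\beta \in (0, 1]$ the 
lower bound derived in (22) can be written in terms of the 
constraint on the MMSE, resulting in the following connection 
between the two parameters: 

$$
\alpha = \frac{\beta(\mathsf{snr}_1 - \mathsf{snr}_0) + \mathsf{snr}_0(1 + \beta\mathsf{snr}_1)}{\mathsf{snr}_1(1 + \beta\mathsf{snr}_0)}.
\tag{23}
$$

Substituting this connection in $R_c = \frac{1}{2}\log(1 + \alpha\mathsf{snr}_1)$ results 
in the superposition rate given in (17). ■ 

C. Proof of Theorem 4 

Proof: The code complies with the following constraint: 

$$
\mathrm{MMSE}^{c}(\mathsf{snr}_0) \le \mathrm{mmse}_G(\mathsf{snr}_0) = \frac{\beta}{1 + \beta\mathsf{snr}_0}
\tag{24}
$$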



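where $\mathrm{mmse}_G(\mathsf{snr}_0)$ denotes the MMSE of the estimation of a 
Gaussian random variable, $X_G$, with zero mean and variance 
$\beta$, from $Y = \sqrt{\mathsf{snr}_0}\, X_G + N$, where $N \sim \mathcal{N}(0, 1)$. Thus, 

$$ q(\mathbf{X}, \beta, \mathsf{snr}_0) = \mathrm{mmse}_G(\mathsf{snr}_0) - \mathrm{MMSE}^{c}(\mathsf{snr}_0) \ge 0. \tag{25} $$

According to Theorem 1 the function $q(\mathbf{X}, \beta, \gamma)$ has no 
nonnegative-to-negative zero crossings, thus we may conclude 
that, 

$$ q(\mathbf{X}, \beta, \gamma) \ge 0 \iff \mathrm{MMSE}^{c}(\gamma) \le \mathrm{mmse}_G(\gamma), \quad \forall \gamma \ge \mathsf{snr}_0 \tag{26} $$

and derive the following upper bound, 

$$
\begin{aligned}
I(\mathsf{snr}_1) - I(\mathsf{snr}_0) &= \frac{1}{2}\int_{\mathsf{snr}_0}^{\mathsf{snr}_1} \mathrm{MMSE}^{c}(\gamma)\, d\gamma \\
&\le \frac{1}{2}\int_{\mathsf{snr}_0}^{\mathsf{snr}_1} \mathrm{mmse}_G(\gamma)\, d\gamma \\
&= \frac{1}{2}\log\left(\frac{1 + \beta\mathsf{snr}_1}{1 + \beta\mathsf{snr}_0}\right).
\end{aligned}
\tag{27}
$$

On the other hand, since we are assuming a code that attains 
the maximum rate we can lower bound the above difference 
using the maximum entropy theorem, 

$$
I(\mathsf{snr}_1) - I(\mathsf{snr}_0) \ge R_c - \frac{1}{2}\log(1 + \mathsf{snr}_0)
= \frac{1}{2}\log\left(\frac{1 + \beta\mathsf{snr}_1}{1 + \beta\mathsf{snr}_0}\right).
\tag{28}
$$

Thus, we conclude that any code complying with the MMSE 
constraint and obtaining the maximum rate attains the above 
two inequalities with equality. In order to attain the upper 
bound, we require 

$$ \frac{1}{2}\int_{\mathsf{snr}_0}^{\mathsf{snr}_1} \mathrm{MMSE}^{c}(\gamma)\, d\gamma = \frac{1}{2}\int_{\mathsf{snr}_0}^{\mathsf{snr}_1} \mathrm{mmse}_G(\gamma)\, d\gamma; $$

however, due to (26) we have, 

$$ \mathrm{MMSE}^{c}(\gamma) = \mathrm{mmse}_G(\gamma), \quad \forall \gamma \in [\mathsf{snr}_0, \mathsf{snr}_1]. $$

In order to attain the lower bound, given that $I(\mathsf{snr}_1) = R_c$, 
we require, 

$$ I(\mathsf{snr}_0) = \frac{1}{2}\log(1 + \mathsf{snr}_0) $$

which guarantees $I(\gamma) = \frac{1}{2}\log(1 + \gamma)$, and hence 
$\mathrm{MMSE}^{c}(\gamma) = \frac{1}{1 + \gamma}$, for all $\gamma \in 
[0, \mathsf{snr}_0]$. Finally, for $\gamma \in [\mathsf{snr}_1, \infty)$, since we assume 
codebooks that are reliably decoded at $\mathsf{snr}_1$, $\mathrm{MMSE}^{c}(\gamma) = 0$. To 
conclude, we have shown that for any code complying with 
the MMSE constraint and attaining the maximum rate, the 
$\mathrm{MMSE}^{c}(\gamma)$ function is determined for all $\gamma \in [0, \infty)$, and thus 
also the mutual information. This concludes our proof. ■ 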
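The algebra connecting (20), (22) and (23) in the proof of Theorem 3 can be checked numerically. The sketch below solves (20) for $d$ in closed form, evaluates the lower bound (22), and confirms that the $\alpha$ obtained from (23) reproduces $d = \beta$ and the rate (17). Base-2 logarithms, the example parameters (those quoted for Fig. 1), and the function names are assumptions for illustration.

```python
import math

def d_from_alpha(alpha, snr0, snr1):
    """Solve (20) for d: (1+alpha*snr1)/(1+snr0) = (1+d*snr1)/(1+d*snr0)."""
    return (alpha * snr1 - snr0) / (snr1 * (1 + snr0) - snr0 * (1 + alpha * snr1))

def mmse_lower_bound(alpha, snr0, snr1):
    """Closed form on the right-hand side of (22)."""
    return (alpha * snr1 - snr0) / ((snr1 - snr0) * (1 + snr0))

def alpha_from_beta(beta, snr0, snr1):
    """Relation (23) between the two parameterizations."""
    return (beta * (snr1 - snr0) + snr0 * (1 + beta * snr1)) / (snr1 * (1 + beta * snr0))

snr0, snr1, beta = 2.0, 2.5, 0.4
alpha = alpha_from_beta(beta, snr0, snr1)
d = d_from_alpha(alpha, snr0, snr1)
rate_alpha = 0.5 * math.log2(1 + alpha * snr1)
rate_17 = 0.5 * math.log2(1 + beta * snr1) + 0.5 * math.log2((1 + snr0) / (1 + beta * snr0))
print(alpha, d, mmse_lower_bound(alpha, snr0, snr1), beta / (1 + beta * snr0))
print(rate_alpha, rate_17)
```

With these parameters $d$ coincides with $\beta$, the bound (22) coincides with the constraint level $\beta/(1+\beta\,\mathsf{snr}_0)$, and the two rate expressions agree.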
V. Multi-MMSE Constraints 

In this section we extend the results for the single MMSE 
constraint, given in the previous section, to $K$ MMSE 
constraints, and examine the same question: under these $K$ MMSE 
constraints, what is the maximum possible rate? 

A. Main Results 

The main result of this section is given in the next theorem. 

Theorem 5: Assume a set of SNRs, $(\mathsf{snr}_0, \mathsf{snr}_1, \ldots, \mathsf{snr}_K)$, 
such that $\mathsf{snr}_0 < \mathsf{snr}_1 < \cdots < \mathsf{snr}_K$ ($K \ge 1$ is some natural 
number). The solution of the following optimization problem, 

$$
\begin{aligned}
\max \quad & I(\mathsf{snr}_K) \\
\text{s.t.} \quad & \mathrm{MMSE}^{c}(\mathsf{snr}_i) \le \frac{\beta_i}{1 + \beta_i\mathsf{snr}_i}, \quad \forall i \in \{0, 1, \ldots, K-1\}
\end{aligned}
$$

for some $\beta_i \in [0, 1]$, $i \in \{0, 1, \ldots, K-1\}$, such that 
$\sum_{i=0}^{K-1} \beta_i \le 1$ and 
$\beta_{K-1} \le \beta_{K-2} \le \cdots \le \beta_1 \le \beta_0$, 
is the following 

$$
I(\mathsf{snr}_K) = \frac{1}{2}\log\left(\frac{1 + \mathsf{snr}_0}{1 + \beta_0\mathsf{snr}_0} \prod_{j=1}^{K-1} \frac{1 + \beta_{j-1}\mathsf{snr}_j}{1 + \beta_j\mathsf{snr}_j}\right)
+ \frac{1}{2}\log(1 + \beta_{K-1}\mathsf{snr}_K)
\tag{29}
$$

and is attainable when using the optimal $K$-layers Gaussian 
superposition codebook designed for $(\mathsf{snr}_0, \mathsf{snr}_1, \ldots, \mathsf{snr}_K)$ 
with rate-splitting coefficients $(\beta_0, \ldots, \beta_{K-1})$. 
Additional constraints of the following form: 

$$ \mathrm{MMSE}^{c}(\mathsf{snr}_\ell) \le \frac{\beta_\ell}{1 + \beta_\ell\mathsf{snr}_\ell} \tag{30} $$

for $\mathsf{snr}_{i-1} \le \mathsf{snr}_\ell \le \mathsf{snr}_i$ when $\beta_\ell \ge \beta_{i-1}$, do not affect the 
above result. 

Theorem 5 states that $K$-layers superposition codes attain 
the maximum possible rate at $\mathsf{snr}_K$ under a set of $K$ MMSE 
constraints at lower SNRs. However, there might be a different 
codebook with this property, which also has some other 
desirable properties. In the next theorem we prove that the 
behavior of the MMSE and the mutual information as a 
function of the SNR, for any code attaining the maximum rate 
under the set of MMSE constraints, is known for all SNRs, and 
is that of $K$-layers superposition codes. Thus, no other code 
can outperform superposition codes in this sense. 

Theorem 6: The $\mathrm{MMSE}^{c}(\gamma)$ (and thus also $I(\gamma)$) of any 
code attaining the maximum rate at $\mathsf{snr}_K$, under the MMSE 
constraints defined in Theorem 5, is known for all $\gamma \ge 0$, and 
is that of the $K$-layers superposition codebook. 

B. Proof of Theorem 5 

Proof: It is simple to verify that the optimal Gaussian 
$K$-layers superposition codebook (Theorem 2) complies with the 
above MMSE constraints and attains the maximum rate. Thus, 
we need to derive a tight upper bound on the rate. Deriving 
the upper bound begins with the usage of Theorem 3. Due to 
the constraint at $\mathsf{snr}_0$: 

$$ \mathrm{MMSE}^{c}(\mathsf{snr}_0) \le \frac{\beta_0}{1 + \beta_0\mathsf{snr}_0} \tag{31} $$

we have the following upper bound 



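$$
I(\mathsf{snr}_1) \le \frac{1}{2}\log(1 + \beta_0\mathsf{snr}_1) + \frac{1}{2}\log\left(\frac{1 + \mathsf{snr}_0}{1 + \beta_0\mathsf{snr}_0}\right).
\tag{32}
$$

The other constraints, for $i \in \{1, 2, \ldots, K-1\}$, can be written 
as follows, 

$$
\mathrm{MMSE}^{c}(\mathsf{snr}_i) \le \frac{\beta_i}{1 + \beta_i\mathsf{snr}_i} = \mathrm{mmse}_{G_i}(\mathsf{snr}_i)
\tag{33}
$$

where $\mathrm{mmse}_{G_i}(\mathsf{snr}_i)$ denotes the MMSE of the estimation of a 
Gaussian random variable, $X_{G_i}$, with zero mean and variance 
$\beta_i$, from $Y = \sqrt{\mathsf{snr}_i}\, X_{G_i} + N$, where $N \sim \mathcal{N}(0, 1)$. Thus, 

$$ q(\mathbf{X}, \beta_i, \mathsf{snr}_i) = \mathrm{mmse}_{G_i}(\mathsf{snr}_i) - \mathrm{MMSE}^{c}(\mathsf{snr}_i) \ge 0. \tag{34} $$

According to Theorem 1 the function $q(\mathbf{X}, \beta_i, \gamma)$ has no 
nonnegative-to-negative zero crossings, thus we may conclude 
that, 

$$ q(\mathbf{X}, \beta_i, \gamma) \ge 0 \iff \mathrm{MMSE}^{c}(\gamma) \le \mathrm{mmse}_{G_i}(\gamma), \quad \forall \gamma \ge \mathsf{snr}_i. \tag{35} $$

This allows us to provide a tight upper bound on the following 
difference: 

$$
\begin{aligned}
I(\mathsf{snr}_{i+1}) - I(\mathsf{snr}_i) &= \frac{1}{2}\int_{\mathsf{snr}_i}^{\mathsf{snr}_{i+1}} \mathrm{MMSE}^{c}(\gamma)\, d\gamma \\
&\le \frac{1}{2}\int_{\mathsf{snr}_i}^{\mathsf{snr}_{i+1}} \mathrm{mmse}_{G_i}(\gamma)\, d\gamma \\
&= \frac{1}{2}\log\left(\frac{1 + \beta_i\mathsf{snr}_{i+1}}{1 + \beta_i\mathsf{snr}_i}\right).
\end{aligned}
\tag{36}
$$

Now, we can write the objective function as follows: 

$$
I(\mathsf{snr}_K) = I(\mathsf{snr}_1) + \sum_{i=1}^{K-1} \left[ I(\mathsf{snr}_{i+1}) - I(\mathsf{snr}_i) \right].
\tag{37}
$$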



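Using (32) and (36) we can bound (37) as follows: 

$$
\begin{aligned}
I(\mathsf{snr}_K) &\le \frac{1}{2}\log(1 + \beta_0\mathsf{snr}_1) + \frac{1}{2}\log\left(\frac{1 + \mathsf{snr}_0}{1 + \beta_0\mathsf{snr}_0}\right) + \sum_{i=1}^{K-1} \frac{1}{2}\log\left(\frac{1 + \beta_i\mathsf{snr}_{i+1}}{1 + \beta_i\mathsf{snr}_i}\right) \\
&= \frac{1}{2}\log\left(\frac{1 + \mathsf{snr}_0}{1 + \beta_0\mathsf{snr}_0}\right) + \frac{1}{2}\log\left( (1 + \beta_0\mathsf{snr}_1)\, \frac{1 + \beta_1\mathsf{snr}_2}{1 + \beta_1\mathsf{snr}_1} \cdot \frac{1 + \beta_2\mathsf{snr}_3}{1 + \beta_2\mathsf{snr}_2} \cdots \frac{1 + \beta_{K-1}\mathsf{snr}_K}{1 + \beta_{K-1}\mathsf{snr}_{K-1}} \right) \\
&= \frac{1}{2}\log\left(\frac{1 + \mathsf{snr}_0}{1 + \beta_0\mathsf{snr}_0}\right) + \frac{1}{2}\log\left( \frac{1 + \beta_0\mathsf{snr}_1}{1 + \beta_1\mathsf{snr}_1} \cdot \frac{1 + \beta_1\mathsf{snr}_2}{1 + \beta_2\mathsf{snr}_2} \cdots \frac{1 + \beta_{K-2}\mathsf{snr}_{K-1}}{1 + \beta_{K-1}\mathsf{snr}_{K-1}} \right) + \frac{1}{2}\log(1 + \beta_{K-1}\mathsf{snr}_K) \\
&= \frac{1}{2}\log\left(\frac{1 + \mathsf{snr}_0}{1 + \beta_0\mathsf{snr}_0} \prod_{j=1}^{K-1} \frac{1 + \beta_{j-1}\mathsf{snr}_j}{1 + \beta_j\mathsf{snr}_j}\right) + \frac{1}{2}\log(1 + \beta_{K-1}\mathsf{snr}_K).
\end{aligned}
\tag{38}
$$

Now, according to (35) we have that any additional constraint, 
$\mathrm{MMSE}^{c}(\mathsf{snr}_\ell) \le \frac{\beta_\ell}{1 + \beta_\ell\mathsf{snr}_\ell}$ for $\mathsf{snr}_{i-1} \le \mathsf{snr}_\ell \le \mathsf{snr}_i$ 
when $\beta_\ell \ge \beta_{i-1}$, is already complied with, since 

$$
\mathrm{MMSE}^{c}(\mathsf{snr}_\ell) \le \frac{\beta_{i-1}}{1 + \beta_{i-1}\mathsf{snr}_\ell} \le \frac{\beta_\ell}{1 + \beta_\ell\mathsf{snr}_\ell}
\tag{39}
$$

and thus, does not affect the result. This concludes our proof. ■ 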
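The telescoping step in (38) can be verified numerically. The sketch below evaluates the first line of (38), that is, the bound (32) plus the increments (36), and compares it with the closed form (29). It assumes base-2 logarithms and uses the parameters quoted for Fig. 2; the function names are hypothetical.

```python
import math

def rate_k_constraints(snrs, betas):
    """Right-hand side of (29) in bits; snrs has K+1 entries, betas has K."""
    K = len(betas)
    prod = (1 + snrs[0]) / (1 + betas[0] * snrs[0])
    for j in range(1, K):
        prod *= (1 + betas[j - 1] * snrs[j]) / (1 + betas[j] * snrs[j])
    return 0.5 * math.log2(prod) + 0.5 * math.log2(1 + betas[K - 1] * snrs[K])

def rate_by_increments(snrs, betas):
    """First line of (38): bound (32) plus the increments (36) for i = 1..K-1."""
    total = 0.5 * math.log2(1 + betas[0] * snrs[1])
    total += 0.5 * math.log2((1 + snrs[0]) / (1 + betas[0] * snrs[0]))
    for i in range(1, len(betas)):
        total += 0.5 * math.log2((1 + betas[i] * snrs[i + 1]) / (1 + betas[i] * snrs[i]))
    return total

snrs, betas = (0.8, 1.7, 2.2, 3.0), (0.6, 0.4, 0.3)   # parameters quoted for Fig. 2
print(rate_k_constraints(snrs, betas), rate_by_increments(snrs, betas))
print(0.5 * math.log2(1 + snrs[-1]))                   # capacity at snr_K, for comparison
```

Both functions should return the same value, and for $K = 1$ both reduce to the single-constraint rate (17).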



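C. Proof of Theorem 6 

Proof: Due to the set of $K$ constraints and following the 
steps that lead to (35) in the proof of Theorem 5 we can 
conclude that 

$$
\mathrm{MMSE}^{c}(\gamma) \le \mathrm{mmse}_{G_i}(\gamma) = \frac{\beta_i}{1 + \beta_i\gamma}, \quad \forall \gamma \ge \mathsf{snr}_i
\tag{40}
$$

for $i \in \{0, 1, 2, \ldots, K-1\}$, where $\mathrm{mmse}_{G_i}(\mathsf{snr}_i)$ denotes the 
MMSE of the estimation of a Gaussian random variable, $X_{G_i}$, 
with zero mean and variance $\beta_i$, from $Y = \sqrt{\mathsf{snr}_i}\, X_{G_i} + N$, 
where $N \sim \mathcal{N}(0, 1)$. 

In the proof of Theorem 5, equation (36), we have seen 
that the above property can be used to construct the following 
upper bounds 

$$
I(\mathsf{snr}_{i+1}) - I(\mathsf{snr}_i) \le \frac{1}{2}\int_{\mathsf{snr}_i}^{\mathsf{snr}_{i+1}} \mathrm{mmse}_{G_i}(\gamma)\, d\gamma
= \frac{1}{2}\log\left(\frac{1 + \beta_i\mathsf{snr}_{i+1}}{1 + \beta_i\mathsf{snr}_i}\right).
\tag{41}
$$

From these upper bounds we can obtain the following 

$$
\begin{aligned}
I(\mathsf{snr}_K) - I(\mathsf{snr}_0) &= \frac{1}{2}\int_{\mathsf{snr}_0}^{\mathsf{snr}_K} \mathrm{MMSE}^{c}(\gamma)\, d\gamma \\
&= \sum_{i=0}^{K-1} \frac{1}{2}\int_{\mathsf{snr}_i}^{\mathsf{snr}_{i+1}} \mathrm{MMSE}^{c}(\gamma)\, d\gamma \\
&\le \sum_{i=0}^{K-1} \frac{1}{2}\int_{\mathsf{snr}_i}^{\mathsf{snr}_{i+1}} \mathrm{mmse}_{G_i}(\gamma)\, d\gamma \\
&= \sum_{i=0}^{K-1} \frac{1}{2}\log\left(\frac{1 + \beta_i\mathsf{snr}_{i+1}}{1 + \beta_i\mathsf{snr}_i}\right) \\
&= \frac{1}{2}\log\left(\prod_{i=0}^{K-1} \frac{1 + \beta_i\mathsf{snr}_{i+1}}{1 + \beta_i\mathsf{snr}_i}\right).
\end{aligned}
\tag{42}
$$

On the other hand, we can lower bound the above difference: 

$$
I(\mathsf{snr}_K) - I(\mathsf{snr}_0) \ge R_c - \frac{1}{2}\log(1 + \mathsf{snr}_0)
= \frac{1}{2}\log\left(\prod_{i=0}^{K-1} \frac{1 + \beta_i\mathsf{snr}_{i+1}}{1 + \beta_i\mathsf{snr}_i}\right)
\tag{43}
$$

where we used both the assumption that the code attains the 
maximum rate at $\mathsf{snr}_K$, under the MMSE constraints (Theorem 
5), and the maximum entropy theorem to obtain the maximum 
mutual information at $\mathsf{snr}_0$. From (42) and (43) we have 

$$
I(\mathsf{snr}_K) - I(\mathsf{snr}_0) = \frac{1}{2}\log\left(\prod_{i=0}^{K-1} \frac{1 + \beta_i\mathsf{snr}_{i+1}}{1 + \beta_i\mathsf{snr}_i}\right)
\tag{44}
$$

for any code attaining the maximum rate at $\mathsf{snr}_K$ under the 
MMSE constraints, given in Theorem 5. Looking at the upper 
bound (42), this equality can be attained only if 

$$
\frac{1}{2}\int_{\mathsf{snr}_i}^{\mathsf{snr}_{i+1}} \mathrm{MMSE}^{c}(\gamma)\, d\gamma = \frac{1}{2}\int_{\mathsf{snr}_i}^{\mathsf{snr}_{i+1}} \mathrm{mmse}_{G_i}(\gamma)\, d\gamma,
\tag{45}
$$

for all $i \in \{0, 1, \ldots, K-1\}$. Due to (40) this is equivalent 
to $\mathrm{MMSE}^{c}(\gamma) = \mathrm{mmse}_{G_i}(\gamma) = \frac{\beta_i}{1 + \beta_i\gamma}$ for all $\mathsf{snr}_i \le \gamma < 
\mathsf{snr}_{i+1}$. Thus, we have determined the function $\mathrm{MMSE}^{c}(\gamma)$ for all 
$\gamma \in [\mathsf{snr}_0, \mathsf{snr}_K]$. Surely, since this is a reliable code designed 
for $\mathsf{snr}_K$, we also have that $\mathrm{MMSE}^{c}(\gamma) = 0$ for all $\gamma \ge \mathsf{snr}_K$. 
The only region that remains to be determined is $\gamma \in [0, \mathsf{snr}_0]$. 
Since the lower bound, (43), is attained with equality and 
$I(\mathsf{snr}_K) = R_c$ we have 

$$ I(\mathsf{snr}_0) = \frac{1}{2}\log(1 + \mathsf{snr}_0) \tag{46} $$

which guarantees that $\mathrm{MMSE}^{c}(\gamma) = \frac{1}{1 + \gamma}$ 
for all $\gamma \in [0, \mathsf{snr}_0]$. 
This concludes our proof. ■ 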



VI. Finite Length Code 

We now extend the single MMSE constraint result, given in 
section IV, to the case of finite length codes. In this case the 
code is not fully reliable, but rather has a small probability 
of error, denoted as $P_e$. In the case that this error probability 
is not known precisely, one may upper bound it using basic 
properties of the code [23]. 

Corollary 1: Assume a finite length code of rate $R_c = 
\frac{1}{2}\log(1 + \alpha\mathsf{snr}_1)$, designed for transmission at $\mathsf{snr}_1$ with error 
probability $P_e$. For any $\mathsf{snr}_0 < \alpha\mathsf{snr}_1$ we have the following 
lower bound, 

$$
\mathrm{MMSE}^{c_n}(\mathsf{snr}_0) \ge
\frac{1 + \alpha\mathsf{snr}_1 - (1 + \mathsf{snr}_0)\, 2^{\frac{2}{n} h_b(P_e)} (1 + \alpha\mathsf{snr}_1)^{P_e}}
{2^{\frac{2}{n} h_b(P_e)} (1 + \alpha\mathsf{snr}_1)^{P_e} \left[\mathsf{snr}_1 - \mathsf{snr}_0 + \mathsf{snr}_0(\mathsf{snr}_1 - \mathsf{snr}_0)\right]}
\tag{47}
$$

where $h_b(\cdot)$ stands for the binary entropy function. 
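The bound (47) is straightforward to evaluate. The following sketch does so for the LDPC parameters quoted in the caption of Fig. 3, under two assumptions that are not confirmed by the text: the block length "5K" is read as $n = 5000$, and the binary entropy is measured in bits. Function names are hypothetical. For $P_e = 0$ the expression reduces to the asymptotic bound (22).

```python
import math

def binary_entropy(p):
    """Binary entropy h_b(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def finite_length_mmse_bound(snr0, snr1, alpha, p_e, n):
    """Right-hand side of (47)."""
    b = 2 ** (2.0 / n * binary_entropy(p_e)) * (1 + alpha * snr1) ** p_e
    num = 1 + alpha * snr1 - (1 + snr0) * b
    den = b * (snr1 - snr0 + snr0 * (snr1 - snr0))
    return num / den

snr1 = 2.5179
alpha = 1.0 / snr1            # so that alpha * snr1 = 1, i.e. R_c = 0.5 bits
n, p_e = 5000, 1e-5           # n = 5000 is an assumed reading of "5K"
for snr0 in (0.1, 0.5, 0.9):
    print(snr0, finite_length_mmse_bound(snr0, snr1, alpha, p_e, n))
```

The finite-length bound is slightly below its asymptotic counterpart, and the gap shrinks as $P_e$ decreases and $n$ grows.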
Proof: Due to Fano's inequality [1] we have, 

$$
\begin{aligned}
I_n(\mathsf{snr}_1) &= R_c - \frac{1}{n} H(\mathbf{X} \mid \mathbf{Y}(\mathsf{snr}_1)) \\
&\ge R_c - \frac{1}{n} h_b(P_e) - \frac{1}{n} P_e \log\left(2^{nR_c} - 1\right) \\
&\ge \frac{1}{2}\log(1 + \alpha\mathsf{snr}_1) - \frac{1}{2}\log 2^{\frac{2}{n} h_b(P_e)} - \frac{1}{n}\log 2^{nP_eR_c} \\
&= \frac{1}{2}\log(1 + \alpha\mathsf{snr}_1) - \frac{1}{2}\log 2^{\frac{2}{n} h_b(P_e)} - \frac{1}{2}\log(1 + \alpha\mathsf{snr}_1)^{P_e} \\
&= \frac{1}{2}\log\left[(1 + \alpha\mathsf{snr}_1)^{1 - P_e}\, 2^{-\frac{2}{n} h_b(P_e)}\right].
\end{aligned}
\tag{48}
$$

Now, using this lower bound in (19) we obtain a new value 
for the parameter $d$, 

$$
d = \frac{1 + \alpha\mathsf{snr}_1 - (1 + \mathsf{snr}_0)\, 2^{\frac{2}{n} h_b(P_e)} (1 + \alpha\mathsf{snr}_1)^{P_e}}
{2^{\frac{2}{n} h_b(P_e)} (1 + \alpha\mathsf{snr}_1)^{P_e}\, \mathsf{snr}_1(1 + \mathsf{snr}_0) - \mathsf{snr}_0(1 + \alpha\mathsf{snr}_1)}.
$$

Placing the above in the lower bound of (22) we obtain the 
desired result. This concludes our proof. ■ 

Remark 1: Note that contrary to the case of $n \to \infty$, 
since the code is not fully reliable, we do not have that 
$\mathrm{MMSE}^{c_n}(\gamma) = 0$ for all $\gamma \ge \mathsf{snr}_1$. Furthermore, we do not 
have a trivial lower bound of zero for $\gamma \ge \alpha\mathsf{snr}_1$. 

VII. The Mutual Information "Disturbance" Measure 

Bandemer and El Gamal [15] suggested a different measure 
of "disturbance" to a receiver not interested in the transmitted 
message. Bandemer and El Gamal examined discrete 
memoryless channels and derived a single-letter expression for 
the problem with a single "disturbance" constraint. Applying 
the single-letter expression to the scalar Gaussian case they obtain the 
following result. 

Corollary 2 ([15]): The rate-disturbance region of the 
Gaussian channel for the pair of SNRs $(\mathsf{snr}_0, \mathsf{snr}_1)$ is 

$$
\begin{aligned}
R &\le \frac{1}{2}\log(1 + \alpha\mathsf{snr}_1) \\
R_d &\ge \frac{1}{2}\log(1 + \alpha\mathsf{snr}_0)
\end{aligned}
\tag{49}
$$

for any $\alpha \in [0, 1]$. The maximum rate is attained by an optimal 
Gaussian codebook designed for $\mathsf{snr}_1$ with limited power of 
$\alpha$. 

Proof: The above result, which was originally proved 
using the entropy power inequality [15], can also be derived 
directly from the I-MMSE formulation. Starting from the 
disturbance rate, since 

$$ 0 \le I_n(\mathsf{snr}_0) \le \frac{1}{2}\log(1 + \mathsf{snr}_0) \tag{50} $$

Fig. 3. The lower bound on the $\mathrm{MMSE}^{c_n}$ of a regular (6,12)-LDPC code 
of length $n = 5K$, $R_c = 0.5$, and $P_e = 10^{-5}$ at $\mathsf{snr}_1 = 2.5179$ (data taken 
from [23, pp. 78]), given for $0 \le \gamma < \alpha\mathsf{snr}_1 = 1$ (in solid). The uncoded 
MMSE is given in dashed. 

As an example for the above lower bound we can examine 
regular LDPC codes, for which the tangential-sphere bound 
(TSB) provides a good upper bound on P e [23]. Using the 
results of [23, pp. 78], we have that a regular (6, 12)-LDPC 
code of block length n = hK and rate R c = 0.5, obtains 
P e = 10~ 5 at snri = 2.5179 and asnr x = 1. The lower 
bound of Corollary 1, for 7 < asnri, is given in Figure 3 (in 
blue), together with the uncoded MMSE [17, eq. (17)], which 
provides an upper bound (in red). Note that for "bad" LDPC 
codes, tighter upper bounds can be provided using Belief- 
Propagation analysis (or the I-MMSE approach) [4]. However, 
these upper bounds improve the upper bound for SNRs nearing 
snri (for which the lower bound of Corollary 1 is useless) and, 
on the other hand, for low SNRs consolidate with the upper 
bound (depicted in red in Figure 3) [4]. 



there exists an a € [0, 1] such that, 

f< 1 

7„(snr ) = -log (1 + asnr ) . 

Using the I-MMSE approach, the above can be written as 
follows, 



(51) 



1MSE c "( 7 )d7 



1 



mmseG(7)d7. 



According to Theorem 1 we conclude that MMSE C (7) and 
m msec (7) are either equal for all 7, or alternatively, cross 
each other once in the region [0, snr ). In both cases we have, 

MMSE Cn (7) < mmse G (7), V7 e [snr , 00). (52) 

Now, upper bounding the rate, 

1 /-snri 

/„ (snri) = -log (l + asnr )+ / MMSE c "( 7 )d7 

* isnr 

< i|og(l + asnn) (53) 

This concludes the I-MMSE based proof. ■ 
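The boundary of the rate-disturbance region in Corollary 2 can be traced numerically. The following sketch is a hedged illustration: the function names are hypothetical, rates are in bits, and the last function checks the I-MMSE step of the proof for the Gaussian codebook of power $\alpha$, for which we take $\mathrm{mmse}_G(\gamma) = \alpha/(1 + \alpha\gamma)$ on $[\mathsf{snr}_0, \mathsf{snr}_1)$.

```python
import math

def alpha_from_disturbance(r_d, snr0):
    """Largest alpha in [0, 1] with 0.5 * log2(1 + alpha * snr0) <= r_d."""
    return min(1.0, (2.0 ** (2.0 * r_d) - 1.0) / snr0)

def max_rate(r_d, snr0, snr1):
    """Maximum rate (bits) at snr1 under a disturbance budget r_d (bits) at snr0."""
    alpha = alpha_from_disturbance(r_d, snr0)
    return 0.5 * math.log2(1.0 + alpha * snr1)

def rate_via_i_mmse(alpha, snr0, snr1, steps=2000):
    """Rate (bits) as disturbance plus half the integrated Gaussian MMSE.

    The integral is computed in nats with a midpoint rule and converted
    to bits by dividing by ln(2).
    """
    width = (snr1 - snr0) / steps
    integral = width * sum(
        alpha / (1.0 + alpha * (snr0 + (k + 0.5) * width)) for k in range(steps)
    )
    return 0.5 * math.log2(1.0 + alpha * snr0) + 0.5 * integral / math.log(2.0)
```

For instance, `rate_via_i_mmse(0.3, 1.0, 4.0)` and `0.5 * math.log2(1 + 0.3 * 4.0)` agree up to the quadrature error, which mirrors the case of equality in (53).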
Extending Corollary 2 to K mutual information disturbance 
constraints, in the Gaussian channel, is trivial since only one 
of the constraints remains effective. The result is given in the 
next corollary. 

Corollary 3: Assume a set of SNRs, $(\mathsf{snr}_0, \mathsf{snr}_1, \ldots, \mathsf{snr}_K)$, such that $\mathsf{snr}_0 < \mathsf{snr}_1 < \cdots < \mathsf{snr}_K$. The solution of

$$\begin{aligned}
&\max\ I_n(\mathsf{snr}_K) \\
&\text{s.t. } \forall i \in \{0, \ldots, K-1\},\quad I_n(\mathsf{snr}_i) \le \frac{1}{2}\log(1 + \alpha_i\,\mathsf{snr}_i)
\end{aligned}$$

for some values $\alpha_i \in [0, 1]$, is the following

$$I_n(\mathsf{snr}_K) = \frac{1}{2}\log(1 + \alpha_\ell\,\mathsf{snr}_K)$$

where $\alpha_\ell$, $\ell \in \{1, \ldots, K-1\}$, is defined such that

$$\forall i \in \{0, \ldots, K-1\}: \quad \frac{1}{2}\log(1 + \alpha_\ell\,\mathsf{snr}_i) \le \frac{1}{2}\log(1 + \alpha_i\,\mathsf{snr}_i).$$

The maximum rate is attained, for any $n$, by choosing the input to be Gaussian with i.i.d. components of variance $\alpha_\ell$. For $n \to \infty$ equality is also attained by an optimal Gaussian codebook designed for $\mathsf{snr}_K$ with limited power of $\alpha_\ell$.
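Since the logarithm is increasing, the defining condition on $\alpha_\ell$ in Corollary 3 amounts to $\alpha_\ell \le \alpha_i$ for every $i$, so the single effective constraint is the one with the smallest $\alpha_i$. A minimal sketch of this selection follows; the function names and the numerical values used to exercise it are hypothetical, and rates are in bits.

```python
import math

def effective_constraint(snrs, alphas):
    """Pick the binding disturbance constraint and the resulting maximum rate.

    snrs   = [snr_0, ..., snr_K] (increasing),
    alphas = [alpha_0, ..., alpha_{K-1}], one per constrained SNR.
    Returns (index l, alpha_l, rate at snr_K in bits).
    """
    ell = min(range(len(alphas)), key=lambda i: alphas[i])
    alpha = alphas[ell]
    rate = 0.5 * math.log2(1.0 + alpha * snrs[-1])
    return ell, alpha, rate

def satisfies_all(snrs, alphas, alpha):
    """Check that a Gaussian input of variance alpha meets every constraint."""
    return all(
        0.5 * math.log2(1.0 + alpha * s) <= 0.5 * math.log2(1.0 + a * s)
        for s, a in zip(snrs[:-1], alphas)
    )
```

Any variance larger than the selected $\alpha_\ell$ violates at least the binding constraint, which `satisfies_all` makes easy to confirm for a given set of values.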

VIII. Conclusions and Discussion 

In this work we quantify the advantage of "bad" point-to-point codes in terms of MMSE. These codes, which do not attain capacity, are heavily used in multi-user wireless networks. We show that the maximum possible rate of an MMSE constrained code is the rate of the corresponding optimal Gaussian superposition codebook. We also show that the MMSE and mutual information behavior, as a function of SNR, of any code attaining the maximum rate under the MMSE constraint is known for all SNR. The results are then extended to K MMSE constraints. We also provide a lower bound on the MMSE of finite length codes.

As stated in the Introduction, the single MMSE constraint result provides engineering insight into the good performance of the HK superposition scheme on the two-user interference channel, as shown in [14]. Our results, showing that the HK superposition scheme is optimal MMSE-wise, suggest that one cannot construct better codes of the type defined in [3] that will beat HK through the use of estimation. Note that, as mentioned in [3, section V], the codes constructed there have an important complexity advantage over HK codes.

The HK scheme is efficient in the two-user interference channel and only a simple approach in the general K-user interference channel. In other words, the MMSE-wise optimality of this scheme for the K MMSE constrained problem is not sufficient to guarantee an efficient coding scheme. The reason is that the K MMSE constrained problem is a huge simplification of the interference channel, as only a single message is transmitted and creates interference to K receivers, whereas in the K-user interference channel each receiver suffers interference from all other $K-1$ transmitters. As is well known, the interference alignment approach obtains, for certain interference channel coefficients, better results, in terms of rate and degrees of freedom, than the HK scheme in the K-user interference channel. It has been shown that I-MMSE considerations based on information and MMSE dimension are useful also in these kinds of problems, see [24], [25] and references therein.

In the previous section we have shown that the different 
disturbance measure suggested by Bandemer and El Gamal 
[15] does not suggest rate-splitting in the scalar Gaussian 
channel, but rather an optimal Gaussian codebook of reduced 
power. Moreover, the extension to K constraints reduces to 
a single effective constraint and also suggests an optimal 
Gaussian codebook of reduced power. On the other hand, the results of Bandemer and El Gamal are valid for any finite $n$ as opposed to our results which are given only for $n \to \infty$. To conclude, the two measures of disturbance are conceptually different. Finally, Bandemer and El Gamal [15] also extended their work to the MIMO Gaussian channel, where optimality does require rate-splitting codebooks. One of our challenges is to extend the MMSE constrained problem to the MIMO
Gaussian channel. Note also that the results of Bandemer and
El Gamal for the Gaussian channel 

The main challenge, which also has significant implications for the design of actual codes, is the extension of the results given above to the finite $n$ case. In other words, what is the maximum mutual information given a constraint on the MMSE of a finite length code? This optimization problem is also interesting for $n = 1$, where no code is considered. It was conjectured in [26] that for the $n = 1$ case, the optimizing, finite variance, random variable is discrete.

In this work we proved that under MMSE constraints at lower SNRs, the optimal code attaining maximum rate, when $n \to \infty$, is a superposition codebook. This raises another challenge: what is the maximum possible rate if we further limit the discussion to single structured codes (still at $n \to \infty$)? In other words, what is the solution of the given optimization problems if we add an additional constraint that the MMSE curve does not exhibit phase transitions, and is continuous until $\mathsf{snr}_K$?

Appendix 

A. Proof of Theorem 2 

Proof: The proof given here is an elaboration of the last paragraph in [22, section V.C, The Gaussian Broadcast Channel], which provides the optimal Gaussian BC codebook viewpoint. We prove only the expressions of the two-layered optimal Gaussian superposition codebook. The extension to the general $L$-layers ($L > 1$) is straightforward.

Using the definition of an optimal Gaussian superposition codebook given in Definition 2, we have a Markov chain, $(U, V) - X - Y(\gamma)$, and the mutual information can be written as follows:

$$\begin{aligned}
I_n(\gamma) &= \frac{1}{n} I\left(X; Y(\gamma) = \sqrt{\gamma}\, X + N\right) \\
&= \frac{1}{n} I(U, X; Y(\gamma)) \\
&= \frac{1}{n} I(U; Y(\gamma)) + \frac{1}{n} I(X; Y(\gamma) \mid U).
\end{aligned} \tag{54}$$

We want to derive the limit, as $n \to \infty$, of the above expression. As we are examining a two-layered optimal Gaussian superposition codebook we have a pair of relevant SNR points, $(\mathsf{snr}_0, \mathsf{snr}_1)$. We begin by examining $I(U; Y(\gamma))$ at SNRs below $\mathsf{snr}_0$, for $n \to \infty$. At these SNRs the private message acts as additive Gaussian noise, since otherwise one could take advantage of that and transmit the common message at a higher rate, contradicting the capacity of the scalar Gaussian BC. Thus, we have, for $n \to \infty$,

$$I(U; Y(\gamma)) = I\left(U; \sqrt{\frac{\gamma}{\gamma\beta + 1}}\, U + N\right) \tag{55}$$

where $N$ is standard Gaussian noise. Since $U$ is a codeword from an optimal Gaussian codebook with power $1 - \beta$, (55) was determined in [2], and is,

$$\lim_{n \to \infty} \frac{1}{n} I(U; Y(\gamma)) = \frac{1}{2}\log\left(1 + \frac{(1 - \beta)\gamma}{\gamma\beta + 1}\right) = \frac{1}{2}\log\left(\frac{1 + \gamma}{1 + \gamma\beta}\right) \tag{56}$$

for $\gamma \le \mathsf{snr}_0$ (for $\gamma = \mathsf{snr}_0$ we have exactly the scalar Gaussian BC limit, thus we can see that without the assumption on the private message acting as Gaussian i.i.d. noise, one could exceed this limit). For $\gamma \ge \mathsf{snr}_0$ the mutual information flattens and equals the rate of the codebook.

Going on to the second term in (54) we have:

$$I(X; Y(\gamma) \mid U) = I\left(V; \sqrt{\gamma}\, V + N\right) \tag{57}$$

which is again the mutual information of an optimal Gaussian codebook, this time with power $\beta$,

$$\lim_{n \to \infty} \frac{1}{n} I(X; Y(\gamma) \mid U) = \frac{1}{2}\log(1 + \beta\gamma). \tag{58}$$

This value remains valid for all $\gamma \le \mathsf{snr}_1$. For $\gamma \ge \mathsf{snr}_1$ the above mutual information flattens and equals the rate of this code. Adding the two terms together we obtain the desired expression (12).
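The two limits (56) and (58), each flattening at its own SNR, can be combined into a single function of $\gamma$. The sketch below is an illustration only: the function name and the test values are hypothetical, natural logarithms are used (nats), and the expression coded is the sum of (56) and (58) as derived above rather than a transcription of (12).

```python
import math

def two_layer_mi(gamma, snr0, snr1, beta):
    """Limiting normalized mutual information (nats) of a two-layer codebook.

    The common layer (power 1 - beta) saturates at snr0 and the private
    layer (power beta) saturates at snr1, with snr0 < snr1.
    """
    g0 = min(gamma, snr0)   # common-layer term is flat beyond snr0
    g1 = min(gamma, snr1)   # private-layer term is flat beyond snr1
    common = 0.5 * math.log((1.0 + g0) / (1.0 + beta * g0))
    private = 0.5 * math.log(1.0 + beta * g1)
    return common + private
```

Below $\mathsf{snr}_0$ the two terms add up to $\frac{1}{2}\log(1 + \gamma)$, the mutual information of a unit-power Gaussian input, which is a convenient consistency check.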

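Now we turn to examine the derivative of $I_n(\gamma)$ with respect to $\gamma$ (which is up to a factor of $\frac{1}{2}$ the $\mathrm{MMSE}^{c_n}(\gamma)$):

$$\begin{aligned}
\frac{d}{d\gamma} I_n(\gamma) &= \frac{d}{d\gamma} \frac{1}{n} I(U; Y(\gamma)) + \frac{d}{d\gamma} \frac{1}{n} I(X; Y(\gamma) \mid U) \\
&= \frac{d}{d\gamma} \frac{1}{n} I\left(U; \sqrt{\gamma'}\, U + N\right) + \frac{d}{d\gamma} \frac{1}{n} I\left(V; \sqrt{\gamma}\, V + N\right)
\end{aligned} \tag{59}$$

where $\gamma' = \frac{\gamma}{\gamma\beta + 1}$. Examining the first expression on the right-hand-side we can use the chain rule. The derivative with respect to $\gamma'$ is known [2] since we have an optimal Gaussian codebook of power $1 - \beta$ transmitted over an additive Gaussian channel:

$$\begin{aligned}
\frac{d}{d\gamma} \frac{1}{n} I\left(U; \sqrt{\gamma'}\, U + N\right) &= \frac{d\gamma'}{d\gamma}\, \frac{d}{d\gamma'} \frac{1}{n} I\left(U; \sqrt{\gamma'}\, U + N\right) \\
&= \frac{1}{2}\, \frac{1 - \beta}{1 + \gamma'(1 - \beta)}\, \frac{1}{(1 + \gamma\beta)^2} \\
&= \frac{1}{2}\, \frac{1 - \beta}{(1 + \gamma)(1 + \gamma\beta)}.
\end{aligned} \tag{60}$$

This is valid for $\gamma < \mathsf{snr}_0$, after which the MMSE falls to zero. The second expression on the right-hand-side is again an optimal Gaussian codebook of power $\beta$ transmitted over an additive Gaussian channel for which the derivative is the MMSE with known behavior [2]:

$$\frac{d}{d\gamma} \frac{1}{n} I\left(V; \sqrt{\gamma}\, V + N\right) = \frac{1}{2}\, \frac{\beta}{1 + \gamma\beta}, \qquad \gamma < \mathsf{snr}_1.$$

At $\gamma = \mathsf{snr}_1$ the above expression falls to zero. Putting the two together we obtain the desired result of equation (13). This concludes the proof. ■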
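The chain-rule step (60) and the resulting MMSE can be checked numerically. The fragment below is a sketch under stated assumptions: the function names and numerical values are hypothetical, natural logarithms are used so that the MMSE equals twice the derivative of the mutual information, and the MMSE is assembled from the two derivative terms derived above rather than copied from (13).

```python
import math

def common_layer_mi(gamma, beta):
    """0.5 * log(1 + (1 - beta) * gamma') with gamma' = gamma / (gamma * beta + 1)."""
    gamma_prime = gamma / (gamma * beta + 1.0)
    return 0.5 * math.log(1.0 + (1.0 - beta) * gamma_prime)

def common_layer_derivative(gamma, beta):
    """Closed form of (60): derivative of the common-layer term w.r.t. gamma."""
    return 0.5 * (1.0 - beta) / ((1.0 + gamma) * (1.0 + gamma * beta))

def two_layer_mmse(gamma, snr0, snr1, beta):
    """MMSE of the two-layer codebook: twice the sum of the active derivative terms."""
    total = 0.0
    if gamma < snr0:        # common layer not yet decodable
        total += 2.0 * common_layer_derivative(gamma, beta)
    if gamma < snr1:        # private layer not yet decodable
        total += beta / (1.0 + beta * gamma)
    return total

def central_difference(f, x, h=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)
```

Comparing `central_difference(lambda g: common_layer_mi(g, 0.3), 0.5)` with `common_layer_derivative(0.5, 0.3)` confirms the chain-rule simplification, and below $\mathsf{snr}_0$ the two MMSE terms sum to $1/(1 + \gamma)$.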